AITopics | original variable

Variable selection plays a crucial role in improving model effectiveness across diverse fields, particularly for high-dimensional datasets of correlated variables. This work introduces a novel approach, Knockoff with over-parameterization (Knoop), to enhance Knockoff filters for variable selection. Specifically, Knoop first generates multiple knockoff variables for each original variable and integrates them with the original variables into an over-parameterized Ridgeless regression model. For each original variable, Knoop evaluates the coefficient distribution of its knockoffs and compares it with the original variable's coefficient to conduct an anomaly-based significance test, ensuring robust variable selection. Extensive experiments demonstrate superior performance compared to existing methods on both simulated and real-world datasets. In controlled simulations, Knoop achieves a notably higher Area Under the Curve (AUC) of the Receiver Operating Characteristic (ROC) curve for identifying relevant variables against the ground truth, while also showing enhanced predictive accuracy across diverse regression and classification tasks. The analytical results further back up our observations.
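The pipeline described in this abstract (several copies per variable, an over-parameterized ridgeless fit, then a per-variable comparison of coefficients) can be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the function name `knockoff_style_scores`, the use of row-permuted columns as stand-ins for knockoffs, the z-score style statistic, and the synthetic data are all assumptions made for the example.

```python
import numpy as np

def knockoff_style_scores(X, y, n_copies=60, seed=0):
    """Score each column of X against its own stand-in 'knockoff' copies."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    # ASSUMPTION: row-permuted copies stand in for knockoffs. They keep each
    # column's marginal distribution but not the correlation structure that
    # a proper knockoff construction preserves.
    copies = [rng.permutation(X[:, j]) for j in range(p) for _ in range(n_copies)]
    design = np.column_stack([X] + copies)        # shape: n x (p + p * n_copies)
    # With more columns than rows, pinv gives the minimum-norm interpolating
    # least-squares solution, i.e. the "ridgeless" limit of ridge regression.
    beta = np.linalg.pinv(design) @ y
    orig = np.abs(beta[:p])
    fake = np.abs(beta[p:]).reshape(p, n_copies)  # row j: copies of column j
    # ASSUMPTION: anomaly-style score, measuring how far the original
    # coefficient lies outside the spread of its copies' coefficients.
    return (orig - fake.mean(axis=1)) / fake.std(axis=1)

# Synthetic data: 200 samples, 5 variables, only column 0 drives the response.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 5))
y = 3.0 * X[:, 0] + rng.normal(scale=0.5, size=200)
scores = knockoff_style_scores(X, y)
```

Here the design matrix has 305 columns for 200 rows, so the fit is over-parameterized; a large score for a column suggests that its coefficient is unusual relative to its copies.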

artificial intelligence, coefficient, machine learning, (15 more...)

arXiv.org Machine Learning

2501.17889

Country:

Asia > China > Beijing > Beijing (0.04)
Asia > China > Shandong Province > Qingdao (0.04)

Genre: Research Report > Experimental Study (1.00)

Industry: Health & Medicine (0.67)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)

Invariant filtering for wheeled vehicle localization with unknown wheel radius and unknown GNSS lever arm

Chauchat, Paul, Bonnabel, Silvère, Barrau, Axel

arXiv.org Artificial Intelligence | Sep-11-2024

We consider the problem of observer design for a nonholonomic car (more generally, a wheeled robot) equipped with wheel speed sensors, with unknown wheel radius, and whose position is measured via a GNSS antenna placed at an unknown position in the car. In a tutorial and unified exposition, we recall the recent theory of two-frame systems within the field of invariant Kalman filtering. We then show how to adapt it geometrically to address the considered problem, although at first sight the problem seems out of its scope. This yields an invariant extended Kalman filter with autonomous error equations and state-independent Jacobians, which is shown to work remarkably well in simulations. The proposed novel construction thus extends the application scope of invariant filtering.

invariant, observer, state space, (17 more...)

arXiv.org Artificial Intelligence

2409.0705

Country:

Oceania > Australia > Victoria > Melbourne (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > France > Provence-Alpes-Côte d'Azur > Bouches-du-Rhône > Marseille (0.04)

Genre:

Research Report (0.50)
Instructional Material > Course Syllabus & Notes (0.34)

Industry: Automobiles & Trucks (0.66)

Technology: Information Technology > Artificial Intelligence > Robots > Locomotion (0.34)

Applying Dimensionality Reduction with PCA to Cancer Data

#artificialintelligence | Jul-25-2020, 19:20:41 GMT

Principal Component Analysis (PCA) is a powerful and well-established data transformation method that can be used for data visualization, dimensionality reduction, and possibly improved performance in supervised learning tasks. In this use case blog, we examine a dataset consisting of measurements of benign and malignant tumors, computed from digital images of a fine needle aspirate of breast mass tissue. Specifically, these 30 variables describe characteristics of the cell nuclei present in the images, such as texture, symmetry, and radius. The first step in applying PCA is to see whether we can more easily visualize the separation between the malignant and benign classes in two dimensions. To do this, we first divide our dataset into train and test sets and perform the PCA using only the training data.
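The "fit on the training data only" step can be illustrated with a minimal NumPy sketch. The data below is a synthetic stand-in (it is not the tumor dataset), and the 70/30 split, the variable names, and the choice to centre without rescaling are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical stand-in data: 100 samples, 30 correlated features.
latent = rng.normal(size=(100, 3))
X = latent @ rng.normal(size=(3, 30)) + 0.1 * rng.normal(size=(100, 30))
X_train, X_test = X[:70], X[70:]

# Fit: centre with the training mean only, so no test information leaks in.
mean = X_train.mean(axis=0)
Z_train = X_train - mean
# Rows of Vt are the principal directions, ordered by singular value.
_, s, Vt = np.linalg.svd(Z_train, full_matrices=False)
components = Vt[:2]

# Transform: reuse the training mean and directions on both sets.
train_2d = Z_train @ components.T
test_2d = (X_test - mean) @ components.T

# Share of training variance captured by each of the two components.
explained = (s[:2] ** 2) / (s ** 2).sum()
```

The two columns of `train_2d` and `test_2d` are what would be plotted, coloured by class, to inspect the separation in two dimensions.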

artificial intelligence, dataset, machine learning, (13 more...)

#artificialintelligence

Industry: Health & Medicine > Therapeutic Area > Oncology (0.92)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Dimensionality Reduction (0.63)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.52)

Interventions and Counterfactuals in Tractable Probabilistic Models: Limitations of Contemporary Transformations

Papantonis, Ioannis, Belle, Vaishak

arXiv.org Artificial Intelligence | Jan-29-2020

In recent years, there has been increasing interest in studying causality-related properties in machine learning models generally, and in generative models in particular. While that is well motivated, it inherits the fundamental computational hardness of probabilistic inference, making exact reasoning intractable. Probabilistic tractable models have also recently emerged, which guarantee that conditional marginals can be computed in time linear in the size of the model, where the model is usually learned from data. Although initially limited to low tree-width models, recent tractable models such as sum product networks (SPNs) and probabilistic sentential decision diagrams (PSDDs) exploit efficient function representations and also capture high tree-width models. In this paper, we ask the following technical question: can we use the distributions represented or learned by these models to perform causal queries, such as reasoning about interventions and counterfactuals? By appealing to some existing ideas on transforming such models to Bayesian networks, we answer mostly in the negative. We show that when transforming SPNs to a causal graph, interventional reasoning reduces to computing marginal distributions; in other words, only trivial causal reasoning is possible. For PSDDs the situation is only slightly better. We first provide an algorithm for constructing a causal graph from a PSDD, which introduces augmented variables. Intervening on the original variables, once again, reduces to computing marginal distributions, but when intervening on the augmented variables, a deterministic but nonetheless causal semantics can be provided for PSDDs.

node, original variable, psdd, (17 more...)

arXiv.org Artificial Intelligence

2001.10905

Country:

North America > United States (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Sweden > Stockholm > Stockholm (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Variable selection with false discovery rate control in deep neural networks

Song, Zixuan, Li, Jun

arXiv.org Machine Learning | Sep-16-2019

Deep neural networks (DNNs) are famous for their high prediction accuracy, but they are also known for their black-box nature and poor interpretability. We consider the problem of variable selection in DNNs, that is, selecting the input variables that have significant predictive power on the output. We propose a backward elimination procedure called SurvNet, which is based on a new measure of variable importance that applies to a wide variety of networks. More importantly, SurvNet is able to estimate and control the false discovery rate of selected variables, whereas no existing method provides such quality control. Further, SurvNet adaptively determines how many variables to eliminate at each step in order to maximize the selection efficiency. To study its validity, SurvNet is applied to image data and gene expression data, as well as various simulation datasets.

neural network, selection, survnet, (15 more...)

arXiv.org Machine Learning

1909.07561

Country:

Oceania > Australia > South Australia (0.04)
North America > United States > New York (0.04)
North America > United States > Missouri > St. Louis County > St. Louis (0.04)
(3 more...)

Genre:

Research Report > Experimental Study (0.46)
Research Report > New Finding (0.46)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Therapeutic Area > Neurology (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Optimal whitening and decorrelation

Kessy, Agnan, Lewin, Alex, Strimmer, Korbinian

arXiv.org Machine Learning | Dec-17-2016

Whitening, or sphering, is a common preprocessing step in statistical analysis to transform random variables to orthogonality. However, due to rotational freedom there are infinitely many possible whitening procedures. Consequently, there is a diverse range of sphering methods in use, for example based on principal component analysis (PCA), Cholesky matrix decomposition, and zero-phase component analysis (ZCA), among others. Here we provide an overview of the underlying theory and discuss five natural whitening procedures. Subsequently, we demonstrate that investigating the cross-covariance and the cross-correlation matrix between sphered and original variables allows us to break the rotational invariance and to identify optimal whitening transformations. As a result we recommend two particular approaches: ZCA-cor whitening to produce sphered variables that are maximally similar to the original variables, and PCA-cor whitening to obtain sphered variables that maximally compress the original variables.
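As a small illustration of the rotational freedom mentioned above, the sketch below builds two whitening matrices from the same covariance matrix: covariance-based PCA whitening and covariance-based ZCA whitening. It does not implement the ZCA-cor or PCA-cor variants the abstract recommends, which work from the correlation matrix; the synthetic data and variable names are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(3, 3))
X = rng.normal(size=(500, 3)) @ A.T      # correlated sample, rows = observations
Xc = X - X.mean(axis=0)
Sigma = np.cov(Xc, rowvar=False)

# Eigendecomposition Sigma = U diag(lam) U^T (eigh is for symmetric input).
lam, U = np.linalg.eigh(Sigma)
inv_sqrt = np.diag(1.0 / np.sqrt(lam))

W_pca = inv_sqrt @ U.T        # rotate onto principal axes, then rescale
W_zca = U @ inv_sqrt @ U.T    # same, then rotate back: the symmetric Sigma^{-1/2}

Z_pca = Xc @ W_pca.T
Z_zca = Xc @ W_zca.T
```

Both `Z_pca` and `Z_zca` have identity sample covariance, and the two matrices differ only by the orthogonal factor `U`, which is why an additional criterion is needed to prefer one whitening over another.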

artificial intelligence, machine learning, survey article, (17 more...)

arXiv.org Machine Learning

doi: 10.1080/00031305.2016.1277159

1512.00809

Genre:

Overview (0.54)
Research Report (0.50)

Industry: Health & Medicine (0.93)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.34)

Dimension Reduction and Intuitive Feature Engineering for Machine Learning

#artificialintelligence | Sep-28-2016, 15:05:38 GMT

In the previous parts of this series, we looked at an overview of some popular tricks for feature engineering, and examined those tricks in greater detail. In this part, we continue our closer examination of these approaches with a deeper dive into the final techniques described in Part 1. The examples discussed in this article can be reproduced with the source code and datasets available here. As an analyst, you savor the scenario in which you have a lot of data. But, with a lot of data comes the added complexity of analyzing and making better sense of that data.

artificial intelligence, machine learning, principal component, (12 more...)

#artificialintelligence

Country: Europe > Italy (0.05)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Learning in High Dimensional Spaces (0.42)

Dimension Reduction and Intuitive Feature Engineering for Machine Learning

#artificialintelligence | Jul-23-2016, 07:21:35 GMT

In the previous parts of this series, we looked at an overview of some popular tricks for feature engineering, and examined those tricks in greater detail. In this part, we continue our closer examination of these approaches with a deeper dive into the final techniques described in Part 1. The examples discussed in this article can be reproduced with the source code and datasets available here. As an analyst, you savor the scenario in which you have a lot of data. But, with a lot of data comes the added complexity of analyzing and making better sense of that data.

artificial intelligence, machine learning, principal component, (12 more...)

#artificialintelligence

Country: Europe > Italy (0.05)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Learning in High Dimensional Spaces (0.42)

Binary Encodings of Non-binary Constraint Satisfaction Problems: Algorithms and Experimental Results

Samaras, N., Stergiou, K.

arXiv.org Artificial Intelligence | Sep-26-2011

A non-binary Constraint Satisfaction Problem (CSP) can be solved directly using extended versions of binary techniques. Alternatively, the non-binary problem can be translated into an equivalent binary one. In this case, it is generally accepted that the translated problem can be solved by applying well-established techniques for binary CSPs. In this paper we evaluate the applicability of the latter approach. We demonstrate that the use of standard techniques for binary CSPs in the encodings of non-binary problems is problematic and results in models that are very rarely competitive with the non-binary representation. To overcome this, we propose specialized arc consistency and search algorithms for binary encodings, and we evaluate them theoretically and empirically. We consider three binary representations: the hidden variable encoding, the dual encoding, and the double encoding. Theoretical and empirical results show that, for certain classes of non-binary constraints, binary encodings are a competitive option, and in many cases, a better one than the non-binary representation.
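To make the idea of translating a non-binary constraint into binary ones concrete, here is a minimal brute-force sketch of the hidden variable encoding on a toy problem. The problem itself (three variables, the constraint x + y + z == 3) and all names are hypothetical, and the exhaustive enumeration is only for checking equivalence; it is not one of the specialized algorithms the paper proposes.

```python
from itertools import product

# Hypothetical toy problem: three variables with domain {0, 1, 2} and one
# ternary constraint x + y + z == 3.
domains = {"x": [0, 1, 2], "y": [0, 1, 2], "z": [0, 1, 2]}
scope = ("x", "y", "z")

def allowed(t):
    return sum(t) == 3

# Direct (non-binary) solutions: test the ternary constraint itself.
direct = {t for t in product(*(domains[v] for v in scope)) if allowed(t)}

# Hidden variable encoding: one hidden variable h whose domain is the set
# of allowed tuples, plus one binary constraint per original variable
# stating that the variable equals its position in h's tuple.
h_domain = sorted(direct)

def binary_ok(h_value, position, value):
    return h_value[position] == value

# Enumerate assignments to (h, x, y, z) and keep those satisfying every
# binary constraint; project the result back onto (x, y, z).
encoded = set()
for h_value in h_domain:
    for t in product(*(domains[v] for v in scope)):
        if all(binary_ok(h_value, i, t[i]) for i in range(len(scope))):
            encoded.add(t)
```

The sets `encoded` and `direct` coincide, which is the sense in which the binary encoding is equivalent to the original non-binary problem.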

artificial intelligence, constraint, constraint-based reasoning, (16 more...)

arXiv.org Artificial Intelligence

doi: 10.1613/jair.1776

1109.5714

Country: Europe (0.28)

Genre: Research Report > New Finding (1.00)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Constraint-Based Reasoning (1.00)

Binary Encodings of Non-binary Constraint Satisfaction Problems: Algorithms and Experimental Results

Samaras, N., Stergiou, K.

Journal of Artificial Intelligence Research | Nov-2-2005

A non-binary Constraint Satisfaction Problem (CSP) can be solved directly using extended versions of binary techniques. Alternatively, the non-binary problem can be translated into an equivalent binary one. In this case, it is generally accepted that the translated problem can be solved by applying well-established techniques for binary CSPs. In this paper we evaluate the applicability of the latter approach. We demonstrate that the use of standard techniques for binary CSPs in the encodings of non-binary problems is problematic and results in models that are very rarely competitive with the non-binary representation. To overcome this, we propose specialized arc consistency and search algorithms for binary encodings, and we evaluate them theoretically and empirically. We consider three binary representations: the hidden variable encoding, the dual encoding, and the double encoding. Theoretical and empirical results show that, for certain classes of non-binary constraints, binary encodings are a competitive option, and in many cases, a better one than the non-binary representation.

algorithm, constraint, non-binary representation, (14 more...)

Journal of Artificial Intelligence Research

doi: 10.1613/jair.1776

AI Access Foundation

10428

Country:

Europe > Greece (0.04)
Europe > North Macedonia (0.04)
Europe > Denmark > Capital Region > Copenhagen (0.04)

Genre: Research Report > New Finding (1.00)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Constraint-Based Reasoning (1.00)

Knoop: Practical Enhancement of Knockoff with Over-Parameterization for Variable Selection
